Neural Network Task Decomposition Based on Output Partitioning

Authors

  • Sheng-Uei Guan
  • Shanchun Li
  • Kiat Tan
Abstract

In this paper, we propose a new method for task decomposition based on output partitioning. The proposed method is able to find appropriate architectures for large-scale real-world problems automatically and efficiently. Using this method, a problem can be divided flexibly into any chosen number of sub-problems, each of which comprises the whole input vector and a fraction of the output vector. Each module (one per sub-problem) is responsible for producing a fraction of the output vector of the original problem; hence the hidden structure for the original problem's output units is decoupled. These modules can be grown and trained in sequence or in parallel. Combined with a constructive learning algorithm, our method requires neither excessive computation nor any prior knowledge about how to decompose. The feasibility of output partitioning is analyzed and proved. Several benchmarks are implemented to test the validity of this method. Their results show that this method can reduce computation time, increase learning speed, and improve generalization accuracy for both classification and regression problems.

BACKGROUND

Multilayered feedforward neural networks are widely used for pattern classification, function approximation, prediction, optimization, and regression problems. When applied to large-scale real-world problems (tasks), they still suffer from drawbacks such as inefficient use of network resources as the task (and the network) grows larger, and the inability of current learning schemes to cope with high-complexity tasks (G. Auda, M. Kamel and H. Raafat, 1996). Large networks tend to introduce high internal interference because of the strong coupling among their hidden-layer weights (R. A. Jacobs et al., 1991).
Internal interference arises during training: whenever the weights of hidden units are updated, the influences (desired outputs) from several output units clash in their weight-updating directions, forcing the weights to compromise at non-optimal values. A natural way to overcome these drawbacks is to decompose the original task into several sub-tasks using the "divide-and-conquer" technique. Various task decomposition methods have been proposed to date; they can be roughly classified as follows:

1) Functional Modularity. Different functional aspects of a task are modeled independently, and the complete system functionality is obtained by combining these individual functional models (R. E. Jenkins and B. P. Yuhas, 1993).
2) Domain Decomposition. The original input data space is partitioned into several sub-spaces, and each module (one per sub-problem) is trained to fit the local data on its sub-space; examples include the mixture-of-experts architecture (R. A. Jacobs et al., 1991) and the multi-sieving neural network (B. L. Lu et al., 1994).
3) Class Decomposition. A problem is broken down into a set of sub-problems according to the inherent class relations among training data (B. L. Lu and M. Ito, 1999; R. Anand et al., 1995).
4) State Decomposition. Different modules are trained to deal with the different states in which the system can be at any time (V. Petridis and A. Kehagias, 1998).

(Department of Electrical and Computer Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260; * [email protected]. Journal of The Institution of Engineers, Singapore, Vol. 44, Issue 3, 2004.)

Class decomposition methods have been proposed for solving N-class problems. The method proposed in (R. Anand et al., 1995) splits an N-class problem into N two-class sub-problems, and each module is trained to learn one two-class sub-problem. Another method, proposed in (B. L. Lu and M.
Ito, 1999), divides an N-class problem into C(N, 2) = N(N−1)/2 two-class sub-problems. Each two-class sub-problem is learned independently, ignoring the training data belonging to the other N−2 classes. The final overall solution is obtained by integrating all of the trained modules into a min-max modular network.

These class decomposition methods still have some shortcomings. Firstly, they use a predefined network architecture for each module to learn each sub-problem. Secondly, they apply only to classification problems; a more general approach, applicable not only to classification but also to other tasks such as regression, should be explored. Thirdly, they usually divide the problem into a set of two-class sub-problems, which is an obvious limitation: applied to a large-scale, complex N-class problem with large N, a very large number of two-class sub-problems must be learned.

In this paper, we propose a new and more general task decomposition method based on output partitioning to overcome the shortcomings mentioned above. Section 2 briefly describes our design goals. The proposed task decomposition method is presented in section 3. The procedure for parallel growing and result merging is illustrated in section 4. Experiments are implemented and analyzed in section 5. Section 6 presents an automatic output partitioning procedure for classification problems. Conclusions are given in section 7.

DESIGN GOALS

To reduce excessive computation, increase learning speed, and improve generalization accuracy, the proposed method should meet the following design goals.

Design goal 1: Instead of using a predefined network structure, the neural network must automatically grow to an appropriate size without excessive computation.
Finding an appropriate network architecture automatically for a given application, and optimizing the set of weights for that architecture, is one of the key issues in neural network design. Constructive learning algorithms (T. Y. Kwok and D. Y. Yeung, 1997) tackle this problem: they start with a small network and then grow additional hidden units and weights until a satisfactory solution is found. In this paper, we adopt the Constructive Backpropagation (CBP) algorithm (M. Lehtokangas, 1999). CBP is selected because its implementation is simple and the output error needs to be backpropagated through one and only one hidden layer; this makes the CBP algorithm computationally as efficient as the popular Cascade-Correlation (CC) algorithm (M. Lehtokangas, 1999; S. E. Fahlman and C. Lebiere, 1990).

Design goal 2: Flexible decomposition. The original problem can be decomposed into any chosen number of sub-problems (up to the number of output units). For a problem with a high-dimensional output space, always splitting into single-output sub-problems would yield a very large number of modules; instead, we can split it into a small number of modules, each containing several output units. Another advantage of flexible decomposition is that sometimes only a portion of the results is wanted in an application; for example, in classification there are situations where we only want to know whether the current pattern lies in some particular class or not.

Design goal 3: A general decomposition method. The proposed method can be applied not only to classification problems but also to regression problems with multiple output units.
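The constructive growth of design goal 1 can be sketched as follows. This is a minimal, simplified illustration of a CBP-style loop, not the authors' implementation: each new hidden unit (with its output weights) is trained alone against the residual error left by the already-frozen units, then frozen in turn. The learning rate, unit count, and candidate initialization are all assumptions for the sketch.

```python
import numpy as np

def train_constructive(X, T, max_units=10, tol=1e-3, epochs=500, lr=0.1):
    """Grow a one-hidden-layer net one unit at a time (CBP-style sketch).

    Only the newest unit's input and output weights are trained, by
    gradient descent on the residual error; earlier units stay frozen.
    """
    P, N = X.shape
    K = T.shape[1]
    rng = np.random.default_rng(0)
    frozen = []                       # (w_in, w_out) pairs of frozen units
    bias = T.mean(axis=0)             # output bias fitted first
    residual = T - bias
    for _ in range(max_units):
        w_in = rng.normal(scale=0.5, size=N)
        w_out = rng.normal(scale=0.5, size=K)
        for _ in range(epochs):       # train only the newest unit
            h = np.tanh(X @ w_in)                  # (P,)
            err = np.outer(h, w_out) - residual    # (P, K)
            g_out = h @ err / P                    # grad w.r.t. w_out
            g_in = X.T @ ((err @ w_out) * (1 - h ** 2)) / P
            w_out -= lr * g_out
            w_in -= lr * g_in
        h = np.tanh(X @ w_in)
        residual = residual - np.outer(h, w_out)   # freeze the unit
        frozen.append((w_in, w_out))
        if np.mean(residual ** 2) < tol:           # satisfactory solution
            break

    def predict(Xq):
        out = np.tile(bias, (len(Xq), 1))
        for w_in, w_out in frozen:
            out += np.outer(np.tanh(Xq @ w_in), w_out)
        return out

    return predict, len(frozen)
```

Because each unit is fitted to the residual and then frozen, the error only has to be backpropagated through the single hidden layer being grown, which is the efficiency property the text attributes to CBP.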
TASK DECOMPOSITION BASED ON OUTPUT PARTITIONING

Decomposing a large-scale, complex problem into a set of smaller, simpler sub-problems is the first step in implementing modular neural network learning. Our approach splits a complex problem with a high-dimensional output space into a set of sub-problems with low-dimensional output spaces. Let I be the training set for a problem with a K-dimensional output space:

    I = {(X_p, T_p)}, p = 1, ..., P    (1)

where X_p ∈ R^N is the input vector of the pth training pattern, T_p ∈ R^K is the desired output vector for the pth training pattern, and P is the number of training patterns. Suppose we divide the original problem into s sub-problems, each with a K_i-dimensional (i = 1, 2, ..., s) output space:

    I_i = {(X_p, T_p^i)}, p = 1, ..., P    (2)

where T_p^i ∈ R^{K_i} is the desired output vector of the pth training pattern for the ith sub-problem.

Each sub-problem can be solved by growing and training a feedforward neural network (module). A collection of such modules is the overall solution of the original problem. In the following, we show why this method works. Many different cost functions (also called error measures) can be used for network training. The most commonly used is the sum of squared errors and its variations:

    E = Σ_{p=1}^{P} Σ_{k=1}^{K} (o_{pk} − t_{pk})²

where o_{pk} and t_{pk} are the actual and desired values of the kth output unit for the pth training pattern.
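The partitioning of Eq. (2) and the subsequent merging can be sketched as follows. The helper names and the fixed column grouping are illustrative assumptions; any trained module can stand in for the regressor that consumes each sub-problem's targets.

```python
import numpy as np

def partition_targets(T, sizes):
    """Split a (P, K) target matrix column-wise into sub-problem targets.

    sizes = (K_1, ..., K_s) with sum(sizes) == K; sub-problem i keeps the
    whole input vector and only its K_i output columns, as in Eq. (2).
    """
    assert sum(sizes) == T.shape[1]
    bounds = np.cumsum((0,) + tuple(sizes))
    return [T[:, bounds[i]:bounds[i + 1]] for i in range(len(sizes))]

def merge_outputs(partial_outputs):
    """Concatenate module outputs back into the original K-dim output."""
    return np.concatenate(partial_outputs, axis=1)

def sse(O, T):
    """Sum of squared errors over all patterns and output units."""
    return np.sum((O - T) ** 2)
```

Because the sum of squared errors sums over output units, the error of the merged solution equals the sum of the modules' errors on their disjoint output subsets: sse(merged) = sse(module 1) + ... + sse(module s). Minimizing each module's cost independently therefore minimizes the total cost, which is the separability that makes output partitioning feasible.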




Publication date: 2004